ROP (Raster Operation) hardware changes:
Virtually all of NVIDIA’s GeForce 8800 graphics processing unit has been built from the ground up, and the pixel output engines are no different in that respect. They were completely redesigned and come with lots of new features – the most pleasing to see is support for HDR+AA, as it’s a feature that ATI has been touting on its Radeon X1000-series video cards ever since their launch. That’s not all though, as NVIDIA has stepped up the image quality game with a new anti-aliasing technique – we’ll come to that in due course.
GeForce 8800 GTX has six pixel output engine partitions, with each partition being capable of processing four pixels per clock. In traditional terms, GeForce 8800 GTX has a total of 24 pixel output engines and is capable of 24 pixels per clock with colour and Z processing. For Z-only processing, the ROP engines are capable of 192 pixels per clock if a single sample is used in each pixel.
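The per-clock figures above can be checked with a few lines of arithmetic. This is only a sketch using the numbers quoted in the text; the variable names are our own, and the "Z-only rate per partition" is simply derived from the quoted 192 figure rather than a number NVIDIA states directly.

```python
# Per-clock ROP throughput, using only the figures quoted in the text.
PARTITIONS = 6                  # pixel output engine partitions
PIXELS_PER_PARTITION = 4        # colour + Z pixels per clock, per partition
Z_ONLY_PIXELS_PER_CLOCK = 192   # quoted Z-only rate, one sample per pixel

# Colour + Z rate across the whole chip.
colour_z_per_clock = PARTITIONS * PIXELS_PER_PARTITION

# How much faster the Z-only path is than the colour + Z path.
z_only_speedup = Z_ONLY_PIXELS_PER_CLOCK // colour_z_per_clock

# Derived Z-only rate for a single partition.
z_only_per_partition = Z_ONLY_PIXELS_PER_CLOCK // PARTITIONS

print(colour_z_per_clock)    # 24
print(z_only_speedup)        # 8
print(z_only_per_partition)  # 32
```

In other words, the quoted Z-only rate works out to eight times the colour and Z rate.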
The Raster Operation engine supports NVIDIA’s previous anti-aliasing modes, namely multi-sampling, super-sampling and transparency adaptive anti-aliasing. NVIDIA has also added a new anti-aliasing technique, known as Coverage Sampling anti-aliasing – we’ll come to that in a minute.
All of these anti-aliasing modes can be used with FP16 or FP32 render targets, meaning that the GPU can do up to 128-bit HDR (four 32-bit floating-point channels per pixel) and anti-aliasing at the same time. The pixel output engines also support up to eight multiple render targets, each of which can define its own unique colour format. There is also a new colour and Z compression mode designed with higher performance and efficiency in mind.
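To put the "128-bit HDR" figure in context, here is a rough sketch of how the bits per pixel and the raw buffer size fall out of the render target format. The four-channel (RGBA) layout, the 1600x1200 resolution and the 4x sample count are our assumptions for illustration, the helper names are hypothetical, and the calculation ignores the colour compression mentioned above.

```python
# Bits per pixel and raw colour buffer size for floating-point render targets.
CHANNELS = 4  # assumption: RGBA, four channels per pixel


def bits_per_pixel(bits_per_channel: int, channels: int = CHANNELS) -> int:
    """Total bits stored for one colour sample."""
    return bits_per_channel * channels


def colour_buffer_mib(width: int, height: int,
                      bits_per_channel: int, samples: int) -> float:
    """Uncompressed multi-sampled colour buffer size in MiB (hypothetical helper)."""
    bytes_per_sample = bits_per_pixel(bits_per_channel) // 8
    return width * height * samples * bytes_per_sample / (1024 ** 2)


fp16_bits = bits_per_pixel(16)  # 64-bit HDR
fp32_bits = bits_per_pixel(32)  # 128-bit HDR

# Assumed example: 1600x1200 with four colour samples per pixel.
fp16_mib = colour_buffer_mib(1600, 1200, 16, samples=4)
fp32_mib = colour_buffer_mib(1600, 1200, 32, samples=4)

print(fp16_bits, fp32_bits)  # 64 128
print(fp16_mib, fp32_mib)    # 58.59375 117.1875
```

Doubling the precision doubles the memory and bandwidth the pixel output engines have to handle, which is one reason compression matters once HDR and anti-aliasing are combined.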
Z-buffer changes:
Z-buffers have been used in modern GPUs for some time now; they store a depth value for every pixel and are designed to improve efficiency by working out which elements in a scene actually need to be rendered. Each 3D frame is processed and then converted into a 2D image ready for displaying on your monitor, and there’s no point in rendering something that is out of sight – doing so would be an incredibly inefficient way of constructing a 3D scene.
A simple 3D scene & its Z-buffer representation – Source: Wikipedia
Z-culling removes non-visible pixels at high speed during the rasterisation stage of the pipeline. NVIDIA’s GeForce 8800 GTX can cull pixels four times as fast as GeForce 7900 GTX, but neither GPU is capable of catching all situations where pixels are not visible. In the past, Z-culling has been one of the final stages in the ROP unit – this is a bit of a problem because the pixels have already been processed by the time they get to the Z-buffer. This is an incredible waste of resources, especially when some of the more complex shaders can consist of thousands of instructions, only for the processed pixel to be occluded from the final scene.
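The waste described above is easy to see in a toy software model. The sketch below is not how the hardware is built; it is a minimal Z-buffer where the depth test happens after shading, with a hypothetical `shade` function standing in for an expensive pixel shader and a made-up list of fragments.

```python
import math


def shade(fragment):
    """Stand-in for an expensive pixel shader (hypothetical)."""
    return fragment["colour"]


def render_late_z(fragments, width, height):
    """Shade every fragment first, then depth-test it against the Z-buffer."""
    depth = [[math.inf] * width for _ in range(height)]
    colour = [[None] * width for _ in range(height)]
    shaded = 0
    for f in fragments:
        result = shade(f)            # shader cost is paid up front...
        shaded += 1
        x, y = f["x"], f["y"]
        if f["z"] < depth[y][x]:     # ...and only then is visibility checked
            depth[y][x] = f["z"]
            colour[y][x] = result
    return colour, depth, shaded


# Made-up scene: three fragments overlap at pixel (0, 0), one lands on (1, 0).
fragments = [
    {"x": 0, "y": 0, "z": 0.2, "colour": "red"},    # nearest, drawn first
    {"x": 0, "y": 0, "z": 0.5, "colour": "green"},  # hidden behind red
    {"x": 0, "y": 0, "z": 0.9, "colour": "blue"},   # hidden behind red
    {"x": 1, "y": 0, "z": 0.7, "colour": "blue"},
]

colour, depth, shaded = render_late_z(fragments, width=2, height=1)
visible = sum(c is not None for row in colour for c in row)

print(colour[0])          # ['red', 'blue']
print(shaded, visible)    # 4 2
```

Four fragments were shaded to produce two visible pixels, so half of the shader work in this toy scene was thrown away.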
NVIDIA has attempted to resolve these problems with an Early Z implementation. This is designed to test the Z values of pixels before they enter the pixel shader in order to alleviate a lot of the pointless processing time devoted to pixels that never make it into the final 3D scene. It makes complete sense to me and it is surprising that nobody has thought to do something like this before.
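As a rough illustration of the idea, the sketch below tests each fragment's Z value before calling the shader and counts how many shader invocations are actually needed. It is a simplified software model under our own assumptions (a hypothetical `shade` function, made-up fragments given as `(x, y, z, material)` tuples), not a description of NVIDIA's implementation.

```python
import math


def shade(material):
    """Stand-in for an expensive pixel shader (hypothetical)."""
    return material


def render_early_z(fragments, width, height):
    """Depth-test each fragment first and shade only the ones that pass."""
    depth = [[math.inf] * width for _ in range(height)]
    colour = [[None] * width for _ in range(height)]
    shaded = 0
    for x, y, z, material in fragments:
        if z >= depth[y][x]:
            continue                 # occluded: rejected before the shader runs
        depth[y][x] = z
        colour[y][x] = shade(material)
        shaded += 1
    return colour, shaded


# Made-up scene: three fragments covering the same pixel, nearest first.
front_to_back = [(0, 0, 0.2, "red"), (0, 0, 0.5, "green"), (0, 0, 0.9, "blue")]
back_to_front = list(reversed(front_to_back))

colour_ftb, shaded_ftb = render_early_z(front_to_back, width=1, height=1)
colour_btf, shaded_btf = render_early_z(back_to_front, width=1, height=1)

print(colour_ftb[0][0], shaded_ftb)  # red 1
print(colour_btf[0][0], shaded_btf)  # red 3
```

Both orderings end up with the same pixel, but the saving depends on draw order: a test of this kind can only reject a fragment if something nearer has already been written to the Z-buffer.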